Context-based Spatial Description Selection
Abstract
This paper describes how spatial description selection can be tied to discourse and spatial context to improve text coherence in a natural language generation domain. The paper argues that, given coherence relations and discourse goals, selecting spatial descriptions which reflect the hearer’s spatial expectations and echo the intended coherence relation can improve text coherence and thereby facilitate better text comprehension. Five coherence relations and algorithms for selecting spatial descriptions are presented, together with an experiment evaluating this hypothesis.

One of the biggest challenges for data-to-text Natural Language Generation (NLG) systems describing geographic space is to determine appropriate spatial descriptions (SDs) for the areas to be described. Typically such systems aim to report how geo-referenced data (data which has a geographic component) like pollen (Turner, Sripada, Reiter, & Davy, 2006), ice on roads (Turner, Sripada, Reiter, & Davy, 2008) or census data (Thomas & Sripada, 2008) is distributed over a given region, and the task is to come up with a meaningful SD to both (1) approximately locate and (2) characterise an area exhibiting a feature of the data. Locating a given area involves approximately indicating where it lies with cardinal directions (or via another frame of reference), e.g., “in the Southwest”, “above Troy”. Characterising, which is the topic of this paper, involves providing geo-referenced information (e.g., urbanicity, altitude, proximity to the coast, population density, etc.) which characterises the area in a meaningful way. The notion of characterising information follows loosely from the idea that referring expressions (REs), in the context of the ILEX project (O’Donnell, Cheng, & Hitzeman, 1998), inform as well as refer. In the case of spatial descriptions, the focus is mostly on informing appropriately.
The characterising information arises from the geocharacterisation of the area, a data-enrichment process which effectively labels the area with a set of applicable characteristics (e.g., rural, coastal, etc.). This work forms part of the Atlas.txt project, which aims to generate texts that describe census data (Thomas & Sripada, 2008) and paint a coherent picture of how that data is distributed over a region. The SDs considered here are limited to single geocharacteristics, or characterising properties (henceforth CPs). They occur in sentences which typically have the form “data variable is high in coastal areas” and are absolute locative expressions which involve topological spatial prepositions like “in”.

Research into locative expressions (e.g., Herskovits, 1985) distinguishes between the figure, which is the object being located, and the ground, which is the reference object. In the example above, “data variable” is the figure and “coastal areas” are the ground. Notice that in this case the figure isn’t an object at all; rather, examples like these simply attribute a property (i.e., a particular value of a data variable, e.g., “high crime”) to a given area (i.e., the ground). In natural language processing (NLP) settings, the interest in locative expressions focuses primarily on their role as spatial REs, since resolving or generating distinguishing references is crucial for correct interpretation and generation of text. The CPs considered here involve exophoric reference since they are drawn from the speaker’s visual perception, which forms their spatio-temporal context. Kelleher, Costello, and Genabith (2005) argue that resolving a locative expression involves selecting the figure whose location is best described by the ground, and they employ a salience-based ranking of candidate figures based on how well their locations match that of the ground.
However, this approach, like many other approaches addressing spatial REs, considers cases with discrete figures and grounds, e.g., “the tree in front of the house”. Such cases are markedly different from the large, vaguely defined areas described from survey perspectives, like the map CPs considered here. CPs describing properties of an area like the one above lack a clear figure, since data values (e.g., high crime levels) are properties themselves rather than embodied objects, and they also lack a crisp ground, since the locative prepositional phrase (e.g., “in coastal areas”) has vague extent. Also, these CPs are not distinguishing, since they don’t necessarily uniquely distinguish a figure from its ground.

From an NLG perspective, the problem of appropriately characterising a given area can be specified as follows:

1. Given a set of CPs which apply to the region of interest {c_1, c_2, ..., c_n}, e.g., {highland, forested, rural, inland},
2. Apply a procedure p_i out of a set of possible procedures P to select the best CP to describe the region of interest.

The procedures used to filter the applicable CPs each rank them, yielding a set of preference orderings. Many procedures have been suggested for filtering CPs, most of which have mainly been used in computational approaches to generating REs, among them:

1. Causality, i.e., causal links between the variable (e.g., crime) and the CP (e.g., “urban”), which has been used to filter CPs for describing geo-referenced weather data (Turner et al., 2008).
2. Procedures motivated by Gricean Maxims which satisfy Grice’s Cooperative Principle (Grice, 1989), e.g.:
   • Brevity (Dale, 1992; Dale & Reiter, 1995; Gardent, 2002); typically these approaches favour selecting the briefest REs.
   • Coverage, i.e., minimising both the number of false negatives and false positives for CP generation (Turner et al., 2008), which can be seen as involving both the maxims of quality and quantity.
   • Context, which indirectly involves the maxim of relation in that it tries to select CPs which are relevant, where discourse context determines relevance.
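The two-step specification above (take the applicable CPs, then apply one procedure that ranks them) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and not taken from the Atlas.txt system: the region is modelled as a set of map cells named by strings, each CP as the set of cells it labels, and the coverage criterion is read as counting wrongly included plus wrongly excluded cells. The names `make_coverage_procedure`, `rank_cps` and `select_cp` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical data model: map cells are identified by strings, and each CP
# is represented by the set of cells it labels.
CPExtents = Dict[str, FrozenSet[str]]
Procedure = Callable[[str], float]  # maps a CP name to a preference score


def make_coverage_procedure(extents: CPExtents, target: FrozenSet[str]) -> Procedure:
    """Build a procedure that prefers CPs whose extent best matches the target area.

    Assumed reading of coverage: fewer wrongly included cells and fewer
    wrongly excluded cells give a higher (less negative) score.
    """
    def score(cp: str) -> float:
        cells = extents[cp]
        wrongly_included = len(cells - target)  # CP cells outside the target area
        wrongly_excluded = len(target - cells)  # target cells the CP misses
        return -(wrongly_included + wrongly_excluded)
    return score


def rank_cps(candidates: List[str], procedure: Procedure) -> List[str]:
    """Return the candidates as a preference ordering, best first."""
    return sorted(candidates, key=procedure, reverse=True)


def select_cp(candidates: List[str], procedure: Procedure) -> str:
    """Step 2 of the specification: pick the top-ranked CP."""
    return rank_cps(candidates, procedure)[0]


# Toy example: the feature of interest covers cells a, b and c.
extents: CPExtents = {
    "coastal": frozenset({"a", "b", "c", "d"}),
    "highland": frozenset({"b"}),
    "rural": frozenset({"a", "e", "f"}),
}
target = frozenset({"a", "b", "c"})

coverage = make_coverage_procedure(extents, target)
ranking = rank_cps(list(extents), coverage)
best = select_cp(list(extents), coverage)
print(ranking, best)
```

Because a procedure is just a scoring function, other criteria from the list (for instance brevity or a context-based relevance score) could be plugged into `rank_cps` in the same way, each yielding its own preference ordering.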